47 research outputs found
Calibration of an Elastic Humanoid Upper Body and Efficient Compensation for Motion Planning
High absolute accuracy is an essential prerequisite for a humanoid robot to
autonomously and robustly perform manipulation tasks while avoiding obstacles.
We present for the first time a kinematic model for a humanoid upper body
incorporating joint and transversal elasticities. These elasticities lead to
significant deformations due to the robot's own weight, and the resulting model
is implicitly defined via a torque equilibrium. We successfully calibrate this
model for DLR's humanoid Agile Justin, including all Denavit-Hartenberg
parameters and elasticities. The calibration is formulated as a combined
least-squares problem with priors, based on measurements of the end-effector
positions of both arms via an external tracking system. The absolute position
error is massively reduced from 21mm to 3.1mm on average in the whole
workspace. Using this complex and implicit kinematic model in motion planning
is challenging. We show that for optimization-based path planning, integrating
the iterative solution of the implicit model into the optimization loop leads
to an elegant and highly efficient solution. For mildly elastic robots like
Agile Justin, there is no performance impact, and even for a simulated highly
flexible robot with 20 times higher elasticities, the runtime increases by only
30%.
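The integration of the implicitly defined elastic model into a planning loop can be sketched in miniature. The following is a hypothetical 1-DoF example (made-up mass, length, and stiffness, not Agile Justin's parameters): the deflection is the fixed point of a torque equilibrium and can be re-solved, warm-started, whenever the planner queries the kinematics.

```python
import numpy as np

# Minimal sketch (hypothetical 1-DoF link): the elastic deflection delta is
# defined implicitly by a torque equilibrium  k * delta = tau_gravity(q + delta).
# Inside an optimization-based planner, this fixed point is re-solved
# (warm-started from the previous iterate) whenever the kinematics are queried.

M, G, L = 2.0, 9.81, 0.5   # mass [kg], gravity [m/s^2], link length [m]
K = 500.0                   # joint stiffness [Nm/rad]

def gravity_torque(theta):
    return M * G * L * np.cos(theta)

def solve_deflection(q_cmd, delta0=0.0, tol=1e-10, max_iter=50):
    """Fixed-point iteration for the implicitly defined deflection."""
    delta = delta0
    for _ in range(max_iter):
        new = gravity_torque(q_cmd + delta) / K
        if abs(new - delta) < tol:
            return new
        delta = new
    return delta

q_cmd = 0.3                     # commanded joint angle [rad]
delta = solve_deflection(q_cmd)
# equilibrium check: stiffness torque balances gravity torque
assert abs(K * delta - gravity_torque(q_cmd + delta)) < 1e-6
```

For a mildly elastic joint the fixed point converges in a handful of iterations, which is consistent with warm-started integration into the optimization loop adding little runtime.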
Bringing a Humanoid Robot Closer to Human Versatility : Hard Realtime Software Architecture and Deep Learning Based Tactile Sensing
For centuries, it has been a vision of man to create humanoid robots, i.e., machines that not only resemble the shape of the human body but also have similar capabilities, especially in dexterously manipulating their environment. But only in recent years has it become possible to build actual humanoid robots with many degrees of freedom (DOF) and equipped with torque-controlled joints, which are a prerequisite for sensitively acting in the world. In this thesis, we extend DLR's advanced mobile torque-controlled humanoid robot Agile Justin in two important directions to get closer to human versatility. First, we enable Agile Justin, which was originally built as a research platform for dexterous mobile manipulation, to also execute complex dynamic manipulation tasks. We demonstrate this with the challenging task of catching up to two simultaneously thrown balls with its hands. Second, we equip Agile Justin with highly developed and deep learning based tactile sensing capabilities that are critical for dexterous fine manipulation. We demonstrate its tactile capabilities with the delicate task of identifying an object's material simply by gently sweeping a fingertip over its surface.

Key for the realization of complex dynamic manipulation tasks is a software framework that allows for a component-based system architecture to cope with the complexity and the parallel and distributed computational demands of deep sensor-perception-planning-action loops -- but under tight timing constraints. This thesis presents the communication layer of our aRDx (agile robot development -- next generation) software framework, which provides hard realtime determinism and optimal transport of data packets with zero-copy for intra- and inter-process and copy-once for distributed communication. In the implementation of the challenging ball catching application on Agile Justin, we take full advantage of aRDx's performance and advanced features like channel synchronization.
Besides developing the challenging visual ball tracking using only onboard sensing while everything is moving, and the automatic and self-contained calibration procedure to provide the necessary precision, the major contribution is the unified generation of the reaching motion for the arms. The catch point selection, motion planning, and joint interpolation steps are subsumed in one nonlinear constrained optimization problem, which is solved in realtime and allows for the realization of different catch behaviors.

For the highly sensitive task of tactile material classification with a flexible pressure-sensitive skin on Agile Justin's fingertip, we present our deep convolutional network architecture TactNet-II. The input is the raw 16000-dimensional, complex, and noisy spatio-temporal tactile signal generated when sweeping over an object's surface. For comparison, we perform a thorough human-performance experiment with 15 subjects, which shows that Agile Justin reaches superhuman performance in the high-level material classification task (What material ID?) as well as in the low-level material differentiation task (Are two materials the same?). To increase the sample efficiency of TactNet-II, we adapt state-of-the-art deep end-to-end transfer learning to tactile material classification, leading to an up to 15-fold reduction in the number of training samples needed. The presented methods led to six publication awards and award-finalist nominations as well as international media coverage, and also worked robustly at many trade fairs and lab demos.
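The unified catch-point generation can be illustrated with a toy stand-in. Instead of the full nonlinear constrained optimization over catch point, motion, and interpolation, the sketch below merely scans candidate catch times on a ballistic ball trajectory and keeps the feasible candidate with the lowest required hand speed; all limits and numbers are made up for illustration.

```python
import numpy as np

# Toy stand-in (hypothetical limits, point "hand"): pick the catch time on the
# ballistic trajectory that is reachable within the hand's speed and workspace
# limits and needs the least motion -- a crude proxy for the single nonlinear
# constrained optimization solved in realtime on the real robot.

G_VEC = np.array([0.0, 0.0, -9.81])

def ball_pos(p0, v0, t):
    return p0 + v0 * t + 0.5 * G_VEC * t**2

def select_catch(p0, v0, hand0, v_max=3.0, reach=1.2):
    best = None
    for t in np.linspace(0.05, 1.5, 200):
        target = ball_pos(p0, v0, t)
        speed = np.linalg.norm(target - hand0) / t   # straight-line speed needed
        feasible = speed <= v_max and np.linalg.norm(target) <= reach
        if feasible and (best is None or speed < best[1]):
            best = (t, speed, target)
    return best  # (catch time, required speed, catch point) or None

p0 = np.array([2.0, 0.0, 1.5])    # ball start [m]
v0 = np.array([-3.0, 0.0, 2.0])   # ball velocity [m/s]
hand0 = np.array([0.3, 0.0, 0.9]) # current hand position [m]
t_c, speed, point = select_catch(p0, v0, hand0)
assert speed <= 3.0 and np.linalg.norm(point) <= 1.2
```

Different catch behaviors would correspond to different objectives or constraints in this selection step.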
Efficient Learning of Fast Inverse Kinematics with Collision Avoidance
Fast inverse kinematics (IK) is a central component in robotic motion
planning. For complex robots, IK methods are often based on root search and
non-linear optimization algorithms. These algorithms can be massively sped up
using a neural network to predict a good initial guess, which can then be
refined in a few numerical iterations. Going beyond previous work on
learning-based IK, we present a learning approach for the fundamentally more
complex problem of IK with collision avoidance in diverse and previously
unseen environments. From a detailed analysis of the IK learning problem, we
derive a
network and unsupervised learning architecture that removes the need for a
sample data generation step. Using the trained network's prediction as an
initial guess for a two-stage Jacobian-based solver allows for fast and
accurate computation of the collision-free IK. For the humanoid robot Agile
Justin (19 DoF), the collision-free IK is solved in less than 10 milliseconds
(on a single CPU core) and with an accuracy of 10^-4 m and 10^-3 rad based on a
high-resolution world model generated from the robot's integrated 3D sensor.
Our method massively outperforms a random multi-start baseline in a benchmark
with the 19 DoF humanoid and challenging 3D environments. It requires ten times
less training time than a supervised training method while achieving comparable
results.
Comment: Presented at the 2023 IEEE-RAS International Conference on Humanoid Robots.
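The seed-then-refine scheme can be sketched on a toy 2-link planar arm (hypothetical geometry, no collision avoidance); the learned initializer is replaced by a crude analytic stand-in, and the refinement is a damped-least-squares Jacobian solver.

```python
import numpy as np

# Minimal sketch (2-link planar arm): a learned model predicts a coarse initial
# joint guess, which a damped-least-squares solver refines in a few Jacobian
# iterations. The "network" here is a hypothetical stand-in; on the real system
# it is the trained predictor, and the solver also handles collision avoidance.

L1, L2 = 1.0, 0.8  # link lengths [m]

def fk(q):
    return np.array([L1*np.cos(q[0]) + L2*np.cos(q[0]+q[1]),
                     L1*np.sin(q[0]) + L2*np.sin(q[0]+q[1])])

def jac(q):
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0]+q[1]), np.cos(q[0]+q[1])
    return np.array([[-L1*s1 - L2*s12, -L2*s12],
                     [ L1*c1 + L2*c12,  L2*c12]])

def guess_net(target):
    """Stand-in for the learned initializer: coarse elbow from law of cosines."""
    d = np.clip(np.linalg.norm(target), 1e-6, L1 + L2 - 1e-6)
    q2 = np.arccos(np.clip((d**2 - L1**2 - L2**2) / (2*L1*L2), -1, 1))
    q1 = np.arctan2(target[1], target[0])   # coarse: ignores the elbow offset
    return np.array([q1, q2])

def refine(target, q, iters=50, damping=1e-3):
    """Damped least squares (Levenberg-style) refinement of the seed."""
    for _ in range(iters):
        err = target - fk(q)
        J = jac(q)
        q = q + np.linalg.solve(J.T @ J + damping*np.eye(2), J.T @ err)
    return q

target = np.array([1.2, 0.6])
q = refine(target, guess_net(target))
assert np.linalg.norm(fk(q) - target) < 1e-4
```

The better the seed, the fewer numerical iterations are needed, which is where the reported sub-10-ms runtimes come from.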
Combining Shape Completion and Grasp Prediction for Fast and Versatile Grasping with a Multi-Fingered Hand
Grasping objects with limited or no prior knowledge about them is a highly
relevant skill in assistive robotics. Still, in this general setting, it has
remained an open problem, especially when it comes to only partial
observability and versatile grasping with multi-fingered hands. We present a
novel, fast, and high-fidelity deep learning pipeline consisting of a shape
completion module based on a single depth image, followed by a grasp
predictor based on the predicted object shape. The shape
completion network is based on VQDIF and predicts spatial occupancy values at
arbitrary query points. As grasp predictor, we use our two-stage architecture
that first generates hand poses using an autoregressive model and then
regresses finger joint configurations per pose. Critical factors turn out to be
sufficient data realism and augmentation, as well as special attention to
difficult cases during training. Experiments on a physical robot platform
demonstrate successful grasping of a wide range of household objects based on a
depth image from a single viewpoint. The whole pipeline is fast, taking only
about 1 s for completing the object's shape (0.7 s) and generating 1000 grasps
(0.3 s).
Comment: 8 pages, 10 figures, 3 tables, 1 algorithm, 2023 IEEE-RAS
International Conference on Humanoid Robots (Humanoids), Project page:
https://dlr-alr.github.io/2023-humanoids-completio
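The interface of such an occupancy-based shape completion can be sketched with a stand-in decoder. The sphere "shape" and all dimensions below are hypothetical; the real network decodes a latent code from the depth image.

```python
import numpy as np

# Minimal sketch (hypothetical stand-in model): the shape-completion network
# maps a latent code from one depth image to occupancy values at arbitrary
# query points; the grasp predictor then consumes the predicted shape. Here a
# sphere plays the role of the decoded object.

def occupancy(latent, points):
    """Stand-in decoder: occupied where inside a sphere of radius latent[0]."""
    return (np.linalg.norm(points, axis=-1) < latent[0]).astype(np.float32)

def extract_voxel_grid(latent, res=16, half=1.5):
    """Query the decoder on a dense grid, e.g. as input for grasp prediction."""
    lin = np.linspace(-half, half, res)
    pts = np.stack(np.meshgrid(lin, lin, lin, indexing="ij"), axis=-1)
    return occupancy(latent, pts.reshape(-1, 3)).reshape(res, res, res)

latent = np.array([1.0])        # would come from the depth-image encoder
grid = extract_voxel_grid(latent)
assert grid.shape == (16, 16, 16)
assert 0.0 < grid.mean() < 1.0  # partially occupied volume
```

Querying occupancy at arbitrary points is what lets the pipeline trade resolution against the reported sub-second runtimes.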
Learning-Based Real-Time Torque Prediction for Grasping Unknown Objects with a Multi-Fingered Hand
When grasping objects with a multi-fingered hand, it is crucial for grasp stability to apply the correct torques at each joint so that external forces are countered. Most current systems use simple heuristics rather than modeling the required torque correctly. Instead, we propose a learning-based approach that is able to predict torques for grasps on unknown objects in real-time. The neural network, trained end-to-end using supervised learning, is shown to predict torques that are more efficient, and the objects are held with less involuntary movement compared to all tested heuristic baselines. Specifically, for 90 % of the grasps the translational deviation of the object is below 2.9 mm and the rotational deviation below 3.1°. To generate training data, we formulate the analytical computation of torques as an optimization problem and handle the indeterminacy of multi-contact grasps using an elastic model. We further show that the network generalizes to predict torques for unknown objects on the real robot system with an inference time of 1.5 ms.
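The structure of the training-data generation can be sketched in a toy planar setting. The grasp matrix, Jacobians, and weights below are invented for illustration; indeterminacy is resolved here with the least-norm solution, whereas the paper uses an elastic contact model.

```python
import numpy as np

# Toy sketch (hypothetical planar two-contact grasp): find contact forces that
# balance the external wrench, then map them to joint torques via tau = J^T f.
# Multi-contact indeterminacy is resolved with the minimum-norm solution here,
# standing in for the elastic model used to generate the real training targets.

# grasp matrix G (object wrench = G @ f) for two planar point contacts
G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.5, 0.0, -0.5]])     # torque arms of the two contacts
w_ext = np.array([0.0, -9.81 * 0.2, 0.0])  # weight of a 0.2 kg object

# least-norm contact forces countering the external wrench: G f = -w_ext
f = np.linalg.pinv(G) @ (-w_ext)
assert np.allclose(G @ f, -w_ext, atol=1e-8)

# per-contact hand Jacobians (made up) map contact forces to joint torques
J1 = np.array([[0.3, 0.0], [0.0, 0.3]])
J2 = np.array([[0.25, 0.0], [0.0, 0.25]])
tau = J1.T @ f[:2] + J2.T @ f[2:]
assert tau.shape == (2,)
```

A network trained on such (grasp, torque) pairs can then produce the torques in real-time, without solving the optimization online.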
Speeding Up Optimization-based Motion Planning through Deep Learning
Planning collision-free motions for robots with many degrees of freedom is challenging in environments with complex obstacle geometries. Recent work introduced the idea of speeding up the planning by encoding prior experience of successful motion plans in a neural network. However, this 'neural motion planning' did not scale to complex robots in unseen 3D environments as needed for real-world applications. Here, we introduce the 'basis point set', well-known in computer vision, to neural motion planning as a modern compact environment encoding, enabling efficient supervised training of networks that generalize well across diverse 3D worlds. Combined with a new elaborate training scheme, we reach a planning success rate of 100 %. We use the network to predict an educated initial guess for an optimization-based motion planner (OMP), which quickly converges to a feasible solution, massively outperforming random multi-starts when tested on previously unseen environments. For the DLR humanoid Agile Justin with 19 DoF and in challenging obstacle environments, optimal paths can be generated in 200 ms using only a single CPU core. We also show a first successful real-world experiment based on a high-resolution world model from an integrated 3D sensor.
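The basis point set encoding itself is simple enough to sketch directly; the basis size and sampling below are arbitrary choices for illustration.

```python
import numpy as np

# Minimal sketch: a basis point set (BPS) encodes an arbitrary point cloud as a
# fixed-length vector -- the distance from each (fixed, pre-sampled) basis
# point to its nearest cloud point. The planner network thus sees a compact,
# constant-size environment encoding regardless of the cloud's size.

rng = np.random.default_rng(0)
basis = rng.uniform(-1.0, 1.0, size=(256, 3))   # fixed once, shared by all worlds

def bps_encode(cloud):
    # pairwise distances (basis x cloud), minimum over the cloud
    d = np.linalg.norm(basis[:, None, :] - cloud[None, :, :], axis=-1)
    return d.min(axis=1)

cloud_a = rng.uniform(-1.0, 1.0, size=(5000, 3))
cloud_b = rng.uniform(-1.0, 1.0, size=(137, 3))   # different cloud size
enc_a, enc_b = bps_encode(cloud_a), bps_encode(cloud_b)
assert enc_a.shape == enc_b.shape == (256,)       # constant-size encoding
assert (enc_a >= 0.0).all()
```

The fixed output size is what makes standard supervised training of the planning network straightforward across diverse 3D worlds.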
Learning a State Estimator for Tactile In-Hand Manipulation
We study the problem of estimating the pose of an object that is being manipulated by a multi-fingered robotic hand, using only proprioceptive feedback.
To address this challenging problem, we propose a novel variant of differentiable particle filters, which combines two key extensions.
First, our learned proposal distribution incorporates recent measurements in a way that mitigates weight degeneracy.
Second, the particle update works on non-Euclidean manifolds like Lie groups, enabling learning-based pose estimation in 3D on SE(3).
We show that the method can represent the rich and often multi-modal distributions over poses that arise in tactile state estimation.
The models are trained in simulation, but by using domain randomization, we obtain state estimators that can be employed for pose estimation on a real robotic hand (equipped with joint torque sensors).
Moreover, the estimator runs fast, allowing for online usage with update rates of more than 100 Hz on a single CPU core.
We quantitatively evaluate our method and benchmark it against other approaches in simulation.
We also show qualitative experiments on the real torque-controlled DLR-Hand II.
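The manifold-aware particle update can be sketched for the rotation part alone. The noise scale and particle count below are arbitrary; the point is that noise is applied through the exponential map rather than added in Euclidean coordinates.

```python
import numpy as np

# Minimal sketch (SO(3) only): particles live on the manifold, and the update
# perturbs each one through the exponential map of a tangent-space noise
# sample instead of adding Euclidean noise -- the key to pose filtering on
# SE(3) / Lie groups.

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula: tangent vector -> rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

rng = np.random.default_rng(1)
particles = np.stack([np.eye(3)] * 100)          # all particles at identity
noise = rng.normal(scale=0.05, size=(100, 3))    # tangent-space noise samples
particles = np.stack([R @ exp_so3(w) for R, w in zip(particles, noise)])

# every particle remains a valid rotation: R^T R = I, det(R) = 1
for R in particles:
    assert np.allclose(R.T @ R, np.eye(3), atol=1e-8)
    assert np.isclose(np.linalg.det(R), 1.0)
```

Because the update never leaves the manifold, no re-projection or normalization step is needed after each filter iteration.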
Self-Contained Calibration of an Elastic Humanoid Upper Body with a Single Head-Mounted RGB Camera
When a humanoid robot performs a manipulation task, it
first makes a model of the world using its visual sensors
and then plans the motion of its body in this model. For
this, precise calibration of the camera parameters and the
kinematic tree is needed. Besides the accuracy of the
calibrated model, the calibration process should be fast
and self-contained, i.e., no external measurement equipment
should be used. Therefore, we extend our prior work on
calibrating the elastic upper body of DLR's Agile Justin by
now using only its internal head-mounted RGB camera. We use
simple visual markers at the ends of the kinematic chain
and one in front of the robot, mounted on a pole, to get
measurements for the whole kinematic tree. To ensure that
the task-relevant Cartesian error at the end-effectors is
minimized, we introduce virtual noise when fitting our
imperfect robot model, so that a marker's pixel error is
weighted more strongly the farther the marker is from the
camera. This correction reduces the Cartesian error by more
than 20%, resulting in a final accuracy of 3.9mm on average
and 9.1mm in the worst
case. This way, we achieve the same precision as in our
previous work, where an external Cartesian tracking system
was used.
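The effect of the depth-dependent weighting can be sketched with a pinhole camera model. The focal length, marker positions, and the linear depth weight below are hypothetical illustrations of the idea, not the calibrated values.

```python
import numpy as np

# Minimal sketch (pinhole model, synthetic data): under perspective projection,
# a fixed pixel error corresponds to a larger Cartesian error the farther the
# marker is from the camera. Weighting each pixel residual by the marker depth
# therefore steers the least-squares calibration toward minimizing the
# task-relevant Cartesian error rather than the raw pixel error.

F = 600.0  # focal length [px] (made up)

def project(p):
    """Pinhole projection of a 3D camera-frame point to pixel coordinates."""
    return F * p[:2] / p[2]

def weighted_residual(p_meas_px, p_model):
    r = p_meas_px - project(p_model)
    return p_model[2] * r            # weight ~ depth: pixel error -> metric scale

near = np.array([0.0, 0.0, 0.5])     # marker 0.5 m from the camera
far = np.array([0.0, 0.0, 2.0])      # marker 2.0 m from the camera
one_px = np.array([1.0, 0.0])        # identical 1-pixel measurement error
r_near = weighted_residual(project(near) + one_px, near)
r_far = weighted_residual(project(far) + one_px, far)
# the same pixel error weighs 4x more for the 4x more distant marker
assert np.isclose(np.linalg.norm(r_far) / np.linalg.norm(r_near), 4.0)
```

In the paper this reweighting is phrased as virtual measurement noise; the sketch shows only the resulting distance-dependent weighting.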
A Two-Stage Learning Architecture That Generates High-Quality Grasps for a Multi-Fingered Hand
In this work, we investigate the problem of planning stable grasps for object manipulation using an 18-DOF robotic hand with four fingers. The main challenge here is the high-dimensional search space, and we address this problem using a novel two-stage learning process. In the first stage, we train an autoregressive network called the hand-pose-generator, which learns to generate a distribution of valid 6D poses of the palm for a given volumetric object representation. In the second stage, we employ a network that regresses 12D finger positions and scalar grasp qualities from given object representations and palm poses. To train our networks, we use synthetic training data generated by a novel grasp planning algorithm, which also proceeds stage-wise: first the palm pose, then the finger positions. Here, we devise a Bayesian optimization scheme for the palm pose and a physics-based grasp pose metric to rate stable grasps. In experiments on the YCB benchmark data set, we show a grasp success rate of over 83%, as well as qualitative results on real scenarios of grasping unknown objects.
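The two-stage inference can be sketched with stand-in models. Both "networks" below are hypothetical placeholders (random samplers with an invented quality score); only the staged structure mirrors the described architecture.

```python
import numpy as np

# Minimal sketch (hypothetical stand-in models): stage one samples candidate
# 6D palm poses from a generative model, stage two regresses 12D finger
# positions plus a scalar quality per pose; the best-rated grasp is executed.

rng = np.random.default_rng(2)

def sample_palm_poses(obj_encoding, n):
    """Stage 1 stand-in: n candidate poses (3D position + 3D orientation)."""
    return rng.normal(size=(n, 6))

def regress_fingers(obj_encoding, poses):
    """Stage 2 stand-in: a 12D finger configuration and a quality per pose."""
    fingers = rng.normal(size=(len(poses), 12))
    quality = -np.linalg.norm(poses[:, :3], axis=1)  # toy score: prefer near poses
    return fingers, quality

obj = np.zeros(64)                    # would be the volumetric object encoding
poses = sample_palm_poses(obj, 128)
fingers, quality = regress_fingers(obj, poses)
best = np.argmax(quality)
grasp = np.concatenate([poses[best], fingers[best]])
assert grasp.shape == (18,)           # 6D palm pose + 12D finger positions
```

Splitting the 18-dimensional search into a 6D generative stage and a 12D regression stage is what makes the high-dimensional grasp space tractable.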